COD::CIF::Parser: an error-correcting CIF parser for the Perl language.

Authors

  • Andrius Merkys
  • Antanas Vaitkus
  • Justas Butkus
  • Mykolas Okulič-Kazarinas
  • Visvaldas Kairys
  • Saulius Gražulis

Abstract

A syntax-correcting CIF parser, COD::CIF::Parser, is presented that can parse CIF 1.1 files and accurately report the position and the nature of the discovered syntactic problems. In addition, the parser is able to automatically fix the most common and the most obvious syntactic deficiencies of the input files. Bindings for Perl, C and Python programming environments are available. Based on COD::CIF::Parser, the cod-tools package for manipulating the CIFs in the Crystallography Open Database (COD) has been developed. The cod-tools package has been successfully used for continuous updates of the data in the automated COD data deposition pipeline, and to check the validity of COD data against the IUCr data validation guidelines. The performance, capabilities and applications of different parsers are compared.

Similar resources

iotbx.cif: a comprehensive CIF toolbox

iotbx.cif is a new software module for the development of applications that make use of the CIF format. Comprehensive tools are provided for input, output and validation of CIFs, as well as for interconversion with high-level cctbx [Grosse-Kunstleve, Sauter, Moriarty & Adams (2002). J. Appl. Cryst. 35, 126-136] crystallographic objects. The interface to the library is written in Python, whilst p...

Feature Engineering in Persian Dependency Parser

A dependency parser is one of the most important fundamental tools in natural language processing: it extracts the structure of sentences and determines the relations between words based on a dependency grammar. Dependency parsers are well suited to free-word-order languages, such as Persian. In this paper, a data-driven dependency parser has been developed with the help of phrase-structure parser fo...

Studying impressive parameters on the performance of Persian probabilistic context free grammar parser

In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...

An Error Correcting Parser for Context Free Grammars that Takes Less Than Cubic Time

The problem of parsing has been studied extensively for various formal grammars. Given an input string and a grammar, the parsing problem is to check if the input string belongs to the language generated by the grammar. A closely related problem of great importance is one where the inputs are a string I and a grammar G and the task is to produce a string I′ that belongs to the language generate...

Some Topics in Parser Generation

An algorithm has been developed that generates an error-correcting recursive-descent syntax analyzer (parser) with no backtrack from an extended context-free grammar. A program, LLGEN, has been written to implement this algorithm. The paper discusses three aspects of the algorithm: the support of separate compilation, the mechanism for the static or dynamic resolving of conflicts, and the error...

Journal:
  • Journal of applied crystallography

Volume 49, Pt 1

Publication date: 2016